Vocabulary Independent Oov D Vector Mach

Author

  • Tommi Lahti
Abstract

This paper proposes a novel out-of-vocabulary (OOV) word detection method that relies on phoneme-level acoustic measures and Support Vector Machines (SVM). Word-level OOV scores are computed from the phoneme-level in-vocabulary (IV) and OOV information provided by an HMM-based speech recognizer. The OOV word decision is based on a confidence feature vector, which is processed by an SVM classifier. The decision thresholds are independent of the test vocabulary used. The performance of the proposed SVM classification scheme was compared experimentally with word-level and sub-word-level confidence methods. The tests indicate that SVM-based OOV rejection generalizes best to the test set. While all methods provided similar performance after parameter optimization on the training set, the proposed SVM classification scheme decreased the false acceptance rate on the test set by 30.4% compared with the word-level confidence method with experimentally determined decision threshold values.
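
The pipeline described in the abstract (phoneme-level IV and OOV information, collapsed into a word-level confidence feature vector, then classified by an SVM) can be illustrated with a minimal sketch. Everything specific here is an assumption, not taken from the paper: the `Phoneme` fields, the two features (mean and minimum per-frame log-likelihood difference), the toy training vectors, and the choice of a linear SVM trained by subgradient descent on the regularized hinge loss. The abstract does not state which features, kernel, or training procedure were used.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Phoneme:
    """Hypothetical per-phoneme recognizer output (not the paper's exact measures)."""
    iv_loglik: float   # acoustic log-likelihood on the vocabulary-constrained path
    oov_loglik: float  # acoustic log-likelihood on an unconstrained phoneme path
    frames: int        # phoneme duration in frames


def word_features(phonemes: Sequence[Phoneme]) -> List[float]:
    """Collapse phoneme-level scores into a word-level confidence vector.

    Assumed features: mean and minimum of the per-frame IV-minus-OOV
    log-likelihood difference over the phonemes of the word.
    """
    diffs = [(p.iv_loglik - p.oov_loglik) / p.frames for p in phonemes]
    return [sum(diffs) / len(diffs), min(diffs)]


def decision_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed score of a linear classifier: positive means IV, negative means OOV."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X: Sequence[Sequence[float]], y: Sequence[int],
                     lam: float = 1e-3, eta: float = 0.05,
                     epochs: int = 500, seed: int = 0) -> Tuple[List[float], float]:
    """Linear soft-margin SVM via stochastic subgradient descent on the hinge loss.

    Labels are +1 (IV) and -1 (OOV). lam is the L2 regularization weight,
    eta a constant step size; both are illustrative choices.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * decision_value(w, b, X[i]) < 1.0
            w = [(1.0 - eta * lam) * wj for wj in w]   # regularization shrink
            if violated:                               # hinge-loss subgradient step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def is_oov(w: Sequence[float], b: float, phonemes: Sequence[Phoneme]) -> bool:
    """Reject the word as OOV when the SVM score is negative."""
    return decision_value(w, b, word_features(phonemes)) < 0.0


# Toy, made-up training vectors: IV words score close to zero,
# OOV words show a large gap between constrained and free decoding.
TRAIN_X = [[-0.1, -0.2], [-0.05, -0.1], [-0.2, -0.3],
           [-1.5, -2.5], [-2.0, -3.0], [-1.0, -2.0]]
TRAIN_Y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(TRAIN_X, TRAIN_Y)
```

Because the features in this sketch are normalized per frame and per word, the learned decision boundary does not refer to any particular word list. That is one plausible way to read the abstract's claim that the decision thresholds are independent of the test vocabulary, but the paper itself should be consulted for the actual mechanism.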

Similar resources

Learning units for domain-independent out-of-vocabulary word modelling

This paper describes our recent work on detecting and recognizing out-of-vocabulary (OOV) words for robust speech recognition and understanding. To allow for OOV recognition within a word-based recognizer, the in-vocabulary (IV) word network is augmented with an OOV word model so that OOV words are considered simultaneously with IV words during recognition. We explore several configurations for...

Speaker-independent name dialing with out-of-vocabulary rejection

In this paper we propose a system for speaker-independent name dialing in which a name enrolled by a user can be used by other members in a family or co-workers in an office. We use speaker-independent sub-word models during enrollment; the recognized sub-word string is later used during recognition. We also present a mechanism for rejecting out-of-vocabulary (OOV) phrases. The best in-vocabulary...

Replacing OOV Words For Dependency Parsing With Distributional Semantics

Lexical information is an important feature in syntactic processing like part-of-speech (POS) tagging and dependency parsing. However, there is no such information available for out-of-vocabulary (OOV) words, which causes many classification errors. We propose to replace OOV words with in-vocabulary words that are semantically similar according to distributional similar words computed from a lar...

Term-dependent confidence for out-of-vocabulary term detection

Within a spoken term detection (STD) system, the decision maker plays an important role in retrieving reliable detections. Most of the state-of-the-art STD systems make decisions based on a confidence measure that is term-independent, which poses a serious problem for out-of-vocabulary (OOV) term detection. In this paper, we study a term-dependent confidence measure based on confidence normalis...

Optimal size, freshness and time-frame for voice search vocabulary

In this paper, we investigate how to optimize the vocabulary for a voice search language model. The metric we optimize over is the out-of-vocabulary (OoV) rate since it is a strong indicator of user experience. In a departure from the usual way of measuring OoV rates, web search logs allow us to compute the per-session OoV rate and thus estimate the percentage of users that experience a given O...

Publication date: 2002